Gender Identification Using SVM with Combination of MFCC

Authors

  • SANTOSH GAIKWAD
  • BHARTI GAWALI
Abstract

Gender is one of the most important and distinguishing characteristics of speech. Gender information can also be used to improve the performance of speech and speaker recognition systems. Automatic gender classification is a technique that aims to determine the sex of the speaker through speech signal analysis. With the growth of biometric security applications, the practical use of gender identification has increased manyfold. The need for gender identification from speech arises in several situations, such as sorting telephone calls. Many methods of gender identification have been proposed in the literature. We implemented a gender classification method that uses gender-dependent features such as pitch, roll-off and energy in combination with MFCC. The clustered approach over these parameters is implemented using SVM. We also present the experimental results of the proposed approach. It is observed that the accuracy of the gender identification system improves depending on the size of the codebook. The highest accuracy is obtained at a codebook size of 25 with a greater time slice. The accuracy of the system was tested with respect to gender and age. A recognition rate of 95% is achieved in the age group of 25-30.

Keywords: Gender Identification, Pitch, Energy, MFCC, SVM

Advances in Computational Research, ISSN: 0975-3273 & E-ISSN: 0975-9085, Volume 4, Issue 1, 2012

Introduction

Gender identification based on the voice of a speaker consists of detecting whether a speech signal was uttered by a male or a female. Automatically detecting the gender of a speaker has several potential applications. In the context of Automatic Speech Recognition, gender-dependent models are more accurate than gender-independent ones. Hence, gender recognition is needed prior to the application of a gender-dependent model. In the context of speaker recognition, gender detection can improve performance by limiting the search space to speakers of the same gender.
Also, in the context of content-based multimedia indexing, the speaker’s gender is a cue used in annotation. Therefore, automatic gender detection can be a tool in a content-based multimedia indexing system. This paper describes an approach for voice-based gender identification for audio-visual content-based indexing. Several acoustic conditions exist in audio-visual data, such as compressed speech, telephone-quality speech, noisy speech, speech over background music, studio-quality speech, different languages, and so on. A gender identification system must be able to process this variety of speech conditions with acceptable performance.

Gender identification is an important step in speaker and speech recognition systems [1-4]. In these systems, the gender identification step transforms the gender-independent problem into a gender-dependent one, and can thus reduce the size and complexity of the problem [5, 6, 8, 9]. For gender identification based on the speech signal, the most commonly used features are pitch period and Mel-Frequency Cepstral Coefficients (MFCC) [10]. The main intuition for using the pitch period comes from the fact that the average fundamental frequency (reciprocal of pitch period) for men is typically in the range of 100-146 Hz, whereas for women it is 188-221 Hz [11]. However, there are several challenges in using pitch period as the feature for gender identification. First, a good estimate of the pitch

Citation: Santosh Gaikwad, Bharti Gawali and Mehrotra S.C. (2012) Gender Identification Using SVM with Combination of MFCC. Advances in Computational Research, ISSN: 0975-3273 & E-ISSN: 0975-9085, Volume 4, Issue 1, pp.-69-73.

Copyright: Copyright©2012 Santosh Gaikwad, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source
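The pitch intuition described in the introduction (average fundamental frequency of 100-146 Hz for men and 188-221 Hz for women) can be illustrated with a small sketch. This is not the authors' method: the paper feeds pitch, together with other features, to an SVM, and the excerpt does not say how pitch is estimated. The autocorrelation estimator, the function names (`synth_tone`, `estimate_f0`, `coarse_gender_hint`), the 8 kHz sample rate, the 50 ms frame and the 60-400 Hz search band are all assumptions made for illustration, and the input is a synthetic sine tone rather than real speech.

```python
import math

# Average fundamental-frequency ranges quoted in the text (Hz).
MALE_F0_RANGE = (100.0, 146.0)
FEMALE_F0_RANGE = (188.0, 221.0)


def synth_tone(f0_hz, sample_rate=8000, duration_s=0.05):
    """Hypothetical test signal: a pure sine standing in for a voiced frame."""
    n = round(sample_rate * duration_s)
    return [math.sin(2.0 * math.pi * f0_hz * i / sample_rate) for i in range(n)]


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation.

    Assumed technique: pick the lag with the largest (unnormalized)
    autocorrelation inside the lag band implied by [f0_min, f0_max].
    """
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]  # remove DC offset
    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = min(int(sample_rate / f0_min), len(x) - 1)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Summing over the overlap only (no division by its length)
        # mildly favors shorter lags, i.e. the first pitch period.
        score = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def coarse_gender_hint(f0_hz):
    """Illustrative range check only; the paper uses an SVM, not this rule."""
    if MALE_F0_RANGE[0] <= f0_hz <= MALE_F0_RANGE[1]:
        return "male-typical"
    if FEMALE_F0_RANGE[0] <= f0_hz <= FEMALE_F0_RANGE[1]:
        return "female-typical"
    return "undetermined"


if __name__ == "__main__":
    for true_f0 in (120.0, 200.0):
        f0 = estimate_f0(synth_tone(true_f0), 8000)
        print(true_f0, round(f0, 1), coarse_gender_hint(f0))
```

The gap between the two quoted ranges (roughly 146-188 Hz) is returned as "undetermined" here, which hints at why pitch alone is a fragile cue and why the paper combines it with other features.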

Similar resources

Discriminative weight training-based optimally weighted MFCC for gender identification

In this paper, we apply a discriminative weight training to a support vector machine (SVM) based gender identification. In our approach, the gender decision rule is derived by the SVM incorporating the optimally weighted mel-frequency cepstral coefficient (MFCC) based on a minimum classification error (MCE) method which is different from the previous works in that optimal weights are differentl...

Combining five acoustic level modeling methods for automatic speaker age and gender recognition

This paper presents a novel automatic speaker age and gender identification approach which combines five different methods at the acoustic level to improve the baseline performance. The five subsystems are (1) Gaussian mixture model (GMM) system based on mel-frequency cepstral coefficient (MFCC) features, (2) Support vector machine (SVM) based on GMM mean supervectors, (3) SVM based on GMM maxi...

University of the Basque Country + Ikerlan System for NIST 2009 Language Recognition Evaluation

This paper briefly describes the language recognition system developed by the Software Technology Working Group (http://gtts.ehu.es) at the University of the Basque Country in collaboration with IKERLAN Technological Research Center, and submitted to the NIST 2009 Language Recognition Evaluation. The system consists of a hierarchical fusion of individual subsystems: two acoustic GLDS-SVM systems...

Exploring Kernels in SVM-based Classification of Larynx Pathology from Human Voice

In this paper identification of laryngeal disorders using cepstral parameters of human voice is investigated. Mel-frequency cepstral coefficients (MFCC), extracted from audio recordings, are further approximated, using 3 strategies: sampling, averaging, and estimation. SVM and LS-SVM categorize preprocessed data into normal, nodular, and diffuse classes. Since it is a three-class problem, vario...

Robust Text Independent Speaker Identification Using Hybrid GMM-SVM System

This paper introduces and motivates the use of the statistical method Gaussian Mixture Model (GMM) and Support Vector Machines (SVM) for robust text-independent speaker identification. Features are extracted from the dialect DR1 of the TIMIT corpus. They are presented by MFCC, energy, Delta and Delta-Delta coefficients. GMM is used to model the feature extractor of the input speech signal and SV...

Publication date: 2012